

Investigations on Fluctuations of Sums of Random Variables


Abstract. 


We have investigated the fluctuations of sums of random
variables $X_1, X_2, \ldots$. We have generalized previous results
on the random variables $H_n$ connected with this sequence,
obtained new results on the validity of the Arc-sine Law for
independent, not identically distributed random variables,
obtained a generalization of Spitzer's identity, and obtained a
generalization of the equivalence principle. Furthermore,
Toeplitz matrices of Laurent polynomials have been studied. For
the growth of the maximal order statistics of a sequence of
independent, identically distributed random variables a "Law of
the iterated logarithm" has been obtained.


Investigations on Fluctuations of Sums of Random Variables.

1. The main subject for work under the Contract has been
the investigation of the fluctuations of sums of random
variables. We have worked on different problems within this theory.

2. Let $X_1, X_2, \ldots$ be a sequence of random variables
and let $S_0 = 0$, $S_k = X_1 + \cdots + X_k$ for $k = 1, 2, \ldots$.
We have investigated the behaviour of the largest convex minorant
sequence $Z_k$, $k = 0, 1, \ldots, n$, to the sequence
$S_0, \ldots, S_n$. The number of equalities $S_k = Z_k$,
$k = 1, \ldots, n$, is, to some extent, a measure of the
fluctuations of the sequence $S_0, \ldots, S_n$; we denote this
number by $H_n$. (We shall define $H_0$ as 0.) It is clear that
$H_n$ is a random variable, the distribution of which depends
only on the distribution of $X_1, \ldots, X_n$.

The meaning of $H_n$ may be clarified by considering Figure 1,
where the points have coordinates $(k, S_k)$, $k = 0, \ldots, 10$.

In the figure $H_{10}$ has the value 6. It follows from the
figure that another natural statistic similar to $H_n$ is the
number of straight segments in the convex polygon. We denote
this number by $K_n$; evidently $H_n \ge K_n$. If the random
variables have a continuous distribution, then $P(H_n = K_n) = 1$.
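The quantities $H_n$ and $K_n$ can be computed directly from a realized sequence. The following Python sketch is an illustration only and not part of the report: the function names (`partial_sums`, `minorant_vertices`, `h_and_k`) and the sample increments are hypothetical. It builds the lower convex hull of the points $(k, S_k)$ with integer cross products, assuming integer-valued increments so that the equality tests are exact.

```python
from typing import List, Sequence, Tuple


def partial_sums(xs: Sequence[int]) -> List[int]:
    """Return S_0 = 0, S_1, ..., S_n for the increments xs."""
    s = [0]
    for x in xs:
        s.append(s[-1] + x)
    return s


def minorant_vertices(s: Sequence[int]) -> List[int]:
    """Indices of the vertices of the largest convex minorant of (k, s[k])."""
    hull: List[int] = []
    for k in range(len(s)):
        while len(hull) >= 2:
            i, j = hull[-2], hull[-1]
            # cross > 0 means the point j lies strictly below the chord i -> k
            cross = (j - i) * (s[k] - s[i]) - (k - i) * (s[j] - s[i])
            if cross <= 0:
                hull.pop()  # j is on or above the chord, so it is not a vertex
            else:
                break
        hull.append(k)
    return hull


def h_and_k(xs: Sequence[int]) -> Tuple[int, int]:
    """Return (H_n, K_n): touching points with k >= 1 and number of segments."""
    s = partial_sums(xs)
    hull = minorant_vertices(s)
    h = 0
    for i, j in zip(hull, hull[1:]):
        for k in range(i + 1, j + 1):
            # (k, s[k]) lies on the segment from (i, s[i]) to (j, s[j])
            if (k - i) * (s[j] - s[i]) == (j - i) * (s[k] - s[i]):
                h += 1
    return h, len(hull) - 1


if __name__ == "__main__":
    print(h_and_k([1, -2, 0, 2, -1, -1, 3]))  # (H_7, K_7) for a sample path
```

For the sample increments the partial sums are 0, 1, -1, -1, 1, 0, -1, 2; the point with index 3 lies on a segment without being a vertex, which is exactly the situation in which $H_n$ exceeds $K_n$.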




It was shown by Andersen [3] that if the random variables
$X_1, X_2, \ldots$ are independent, and have the same continuous
distribution, then

$$H(s,t) = \sum_{n=0}^{\infty} \sum_{m=0}^{\infty} P(H_n = m)\, s^n t^m = (1-s)^{-t}, \qquad |s| < 1,\ |t| < 1, \tag{1}$$

holds for the double-generating function for $P(H_n = m)$.
(Actually in [3] the formula looks somewhat different, due to a
trivial change in the definition of $H_n$.)
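Formula (1) determines the distribution of $H_n$ explicitly: by the binomial series, the coefficient of $s^n$ in $(1-s)^{-t}$ is $t(t+1)\cdots(t+n-1)/n!$, so $P(H_n = m)$ is the coefficient of $t^m$ in that polynomial. The Python sketch below expands the product with exact rational arithmetic; it is an illustration, and the function name `h_distribution` is hypothetical.

```python
from fractions import Fraction
from math import factorial
from typing import List


def h_distribution(n: int) -> List[Fraction]:
    """Return [P(H_n = 0), ..., P(H_n = n)] read off from formula (1).

    The probabilities are the coefficients of t^m in t(t+1)...(t+n-1)/n!.
    """
    coeffs = [1]  # coeffs[m] is the coefficient of t^m; start with the constant 1
    for k in range(n):
        # multiply the current polynomial by (t + k)
        new = [0] * (len(coeffs) + 1)
        for m, c in enumerate(coeffs):
            new[m] += k * c
            new[m + 1] += c
        coeffs = new
    return [Fraction(c, factorial(n)) for c in coeffs]


if __name__ == "__main__":
    for m, p in enumerate(h_distribution(4)):
        print(m, p)
```

For $n = 4$ the polynomial is $(t^4 + 6t^3 + 11t^2 + 6t)/24$, so for instance $P(H_4 = 4) = 1/24$ and $P(H_4 = 1) = 1/4$.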

The main result obtained for $H_n$ under the Contract is
(cf. Technical Note No. 1):

Theorem 1. Let $X_1, X_2, \ldots$ be independent, identically
distributed random variables. Let $x_1, x_2, \ldots$ be the set of
real numbers $x$ for which $P(S_k = k \cdot x) > 0$ for some
$k > 0$, and let

$$c_j(s) = \sum_{k=1}^{\infty} P(S_k = k \cdot x_j)\, s^k, \qquad j = 1, 2, \ldots, \tag{2}$$

$$c_0(s) = \sum_{k=1}^{\infty} P(S_k \ne k \cdot x_j \text{ for } j = 1, 2, \ldots)\, s^k = s(1-s)^{-1} - \sum_{j=1}^{\infty} c_j(s). \tag{3}$$

Then

$$H(s,t) = \sum_{n=0}^{\infty} \sum_{m=0}^{\infty} P(H_n = m)\, s^n t^m = \exp\Bigl(t \int_0^s x^{-1} c_0(x)\, dx\Bigr) \cdot \prod_{j=1}^{\infty} \Bigl(1 - t + t \cdot \exp\Bigl(-\int_0^s x^{-1} c_j(x)\, dx\Bigr)\Bigr)^{-1}. \tag{4}$$


Unfortunately this formula is much too complicated to be useful.
Even in the case where the random variables $X_1, X_2, \ldots$ have
only two possible values, say $+1$ and $-1$, each with
probability 1/2, it has not been possible to obtain a simple result.

It was shown in TN1 that a simpler formula may be obtained if
the common distribution of $X_1, X_2, \ldots$ has only one
discontinuity point. For this case the double-generating function
of $P(K_n = m)$ was also found.

3. A fundamental result in the theory of fluctuations of
sums of random variables is the so-called Arc-sine Law. Let
$N_n$ denote the number of positive sums among $S_1, \ldots, S_n$.
We say that the Arc-sine Law holds for the sequence
$X_1, X_2, \ldots$ of random variables, if

$$\lim_{n\to\infty} P(N_n/n \le x) = \frac{2}{\pi} \arcsin \sqrt{x}, \qquad 0 \le x \le 1. \tag{5}$$

If for an $a \in (0,1)$ we have

$$\lim_{n\to\infty} P(N_n/n \le x) = \frac{\sin \pi a}{\pi} \int_0^x y^{a-1} (1-y)^{-a}\, dy, \qquad 0 \le x \le 1, \tag{6}$$

then we say that the generalized Arc-sine Law holds for the
random variables $X_1, X_2, \ldots$. For $a = 1/2$ the right hand
side of (6) reduces to the right hand side of (5).
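To see how quickly the limit (5) is approached, one can compare it with an exact finite-$n$ law. The sketch below uses a classical formula that is not stated in this report: for independent, identically distributed random variables with a symmetric continuous distribution, $P(N_n = k) = \binom{2k}{k}\binom{2n-2k}{n-k} 4^{-n}$. The function names are hypothetical, and the code is meant only as an illustration under that assumption.

```python
from math import asin, comb, pi, sqrt
from typing import List


def positive_sums_pmf(n: int) -> List[float]:
    """P(N_n = k), k = 0..n, for i.i.d. symmetric continuous increments
    (classical discrete arc-sine law, assumed here rather than derived)."""
    return [comb(2 * k, k) * comb(2 * (n - k), n - k) / 4**n for k in range(n + 1)]


def fraction_cdf(n: int, x: float) -> float:
    """P(N_n / n <= x) under the same assumption."""
    return sum(p for k, p in enumerate(positive_sums_pmf(n)) if k <= x * n)


def arcsine_cdf(x: float) -> float:
    """Right hand side of (5)."""
    return 2.0 / pi * asin(sqrt(x))


if __name__ == "__main__":
    for x in (0.1, 0.25, 0.5, 0.9):
        print(x, fraction_cdf(400, x), arcsine_cdf(x))
```

The printed pairs show the finite-$n$ distribution function next to its limit; the U-shape of the law is visible in the large mass near $x = 0$ and $x = 1$.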

Erdős and Kac [13] proved that the Arc-sine Law holds
with $a = 1/2$ if the random variables $X_1, X_2, \ldots$ are
independent, have mean 0 and variance 1 and obey the Central
Limit Theorem. Andersen proved [3] that (6) holds if the
random variables $X_1, X_2, \ldots$ are independent, identically
distributed and if

$$\lim_{n\to\infty} P(S_n > 0) = a. \tag{7}$$

It was proved by Spitzer [24] that (7) can be replaced by the
weaker condition

$$\lim_{n\to\infty} \frac{1}{n} \bigl(P(S_1 > 0) + \cdots + P(S_n > 0)\bigr) = a. \tag{8}$$

It is not known whether Spitzer's result is really stronger than
Andersen's result, since it is doubtful that there exist
sequences of independent, identically distributed random variables
for which the limit in (7) does not exist, while (8) is
satisfied.

The result of Erdős and Kac apparently connects the
Arc-sine Law with the Central Limit Theorem, whereas the
result of Andersen shows that there is no such connection, as
it is possible to choose for the common distribution of the
random variables $X_1, X_2, \ldots$ any symmetric, non-degenerate
distribution and thereby obtain the Arc-sine Law.

It is fairly easy to see by heuristic arguments that the
number of positive sums $N_n$ ought to follow the Arc-sine Law,
if the random variables $X_1, X_2, \ldots$ are in some sense
"asymptotically identically distributed". We have investigated
this problem under the Contract without having obtained a
general result. We have, however, obtained certain results
which throw some light on the problem.

Theorem 2. Let $X_1, X_2, \ldots$ be a sequence of independent
random variables. If this sequence is periodic in the sense
that there exist two positive integers $p$ and $q$ and a set
of distribution functions $F_1(x), \ldots, F_p(x)$, not all
degenerate, such that $P(X_{np+r} \le x) = F_r(x)$,
$1 \le r \le p$, if $n \ge q$, then the Arc-sine Law holds for
$X_1, X_2, \ldots$ if $\lim_{n\to\infty} P(S_n > 0) = 1/2$.

Theorem 3. Let $X_1, X_2, \ldots$ be a sequence of independent,
normally distributed random variables with mean 0. Then

(i) if $E(S_n^2) = \log n$, $n = 1, 2, \ldots$, we have

$$\lim_{n\to\infty} P(N_n/n \le x) = \begin{cases} 0 & \text{for } x < 0, \\ 1/2 & \text{for } 0 < x < 1, \\ 1 & \text{for } 1 < x, \end{cases}$$

(ii) if $E(S_n^2) = n^b \Lambda(n)$, $n = 1, 2, \ldots$, where $b > 0$ and
$\Lambda(x)$ is defined, positive and monotone in $0 < x < \infty$
and for every positive constant $c$ satisfies

$$\Lambda(cx)/\Lambda(x) \to 1 \quad \text{for } x \to \infty,$$

we have

$$\lim_{n\to\infty} P(N_n/n \le x) = P\Bigl(\int_0^1 \tfrac{1}{2}\bigl(1 + \operatorname{sign} X(t^b)\bigr)\, dt \le x\Bigr), \qquad 0 \le x \le 1,$$

where $X(t)$, $0 \le t \le 1$, is the Wiener process,

(iii) if $E(S_n^2) = b^n$, $b > 1$, $n = 1, 2, \ldots$, we have

$$\lim_{n\to\infty} P(N_n/n \le x) = \begin{cases} 0 & \text{for } x < 1/2, \\ 1 & \text{for } 1/2 < x. \end{cases}$$

A proof of Theorem 2 is given in TN2. Theorem 3 may be
proved by a modification of the invariance principle used by
Erdős and Kac in [13]. Case (i) of Theorem 3 is due to
G. Maruyama [22]; together with (iii) it represents extreme
cases where the variance of $S_n$ increases so slowly that
practically all sums have the same sign, or so fast that for large
$n$ approximately 50% of the sums are positive. Case (ii)
for $b = 1$ gives the usual Arc-sine Law; for other values of
$b$ it has not been possible to evaluate

$$P\Bigl(\int_0^1 \tfrac{1}{2}\bigl(1 + \operatorname{sign} X(t^b)\bigr)\, dt \le x\Bigr).$$

The most interesting case of Theorem 3, therefore, is case (ii)
with $b = 1$. It shows that the Arc-sine Law may hold for a
sequence of independent random variables $X_1, X_2, \ldots$ even if
the variance of $X_n$ goes to infinity like $(\log n)^c$, $c > 0$,
for $n \to \infty$. In this case the random variables
$X_1, X_2, \ldots$ cannot be said to be "asymptotically identically
distributed" in the usual sense of the words. From this, and also
from Theorem 2, it follows that the Arc-sine Law must hold for a
sequence of independent random variables under rather weak
assumptions about the distribution functions. Several unsuccessful
attempts have been made to prove that the Arc-sine Law holds
under conditions which are more general than the conditions in
Theorem 2, or case (ii) of Theorem 3. Our investigations
have led us to the conjecture that the generalized Arc-sine Law
holds for the independent random variables $X_1, X_2, \ldots$ with
distribution functions $F_1(x), F_2(x), \ldots$ if

$$\lim_{n\to\infty} P(S_n > 0) = a$$

and if there exists a sequence of distribution functions
$G_1(x), G_2(x), \ldots$ such that to any $\varepsilon > 0$ we can
find an $N(\varepsilon)$, tending to infinity when
$\varepsilon \to 0$, and a $\delta(\varepsilon) > 0$, tending to 0
when $\varepsilon \to 0$, such that

$$\sup_{-\infty < x < \infty} \bigl| P(S_{k+n} - S_k \le x) - G_n(x) \bigr| < \varepsilon$$

whenever $k \ge 0$, $k + n \ge N(\varepsilon)$ and
$n > \delta(\varepsilon) N(\varepsilon)$. The last condition is the
way in which we think it natural in this connection to express
that the variables $X_1, X_2, \ldots$ are "asymptotically
identically distributed". We have not been able to verify this
conjecture.

4. A well-known result in the theory of fluctuations of
sums of random variables is Spitzer's identity [24],

$$\sum_{n=0}^{\infty} \varphi(T_n, T_n - S_n)\, t^n = \exp\Bigl(\sum_{n=1}^{\infty} \frac{1}{n}\, \varphi(S_n^+, S_n^+ - S_n)\, t^n\Bigr), \qquad |t| < 1, \tag{9}$$

where $\varphi(X, Y)$ denotes the joint characteristic function of
the random variables $X$ and $Y$, while
$T_n = \max(S_0, S_1, \ldots, S_n)$ and $S_n^+ = \max(0, S_n)$.
Spitzer's identity is valid if the random variables
$X_1, X_2, \ldots$ are independent and identically distributed.
Several proofs are known, some combinatorial and others
relying mostly on analysis or functional analysis.
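A concrete consequence of Spitzer's identity for independent, identically distributed variables, well known in the literature though not stated in this report, is the formula $E\max(0, S_1, \ldots, S_n) = \sum_{k=1}^{n} \frac{1}{k} E\max(0, S_k)$. The Python sketch below checks this by exhaustive enumeration for increments that are uniform on a small finite set; the function names and the step sets are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product
from typing import Sequence


def expected_max_direct(n: int, steps: Sequence[int] = (-1, 1)) -> Fraction:
    """E max(0, S_1, ..., S_n), enumerating all equally likely paths."""
    total = Fraction(0)
    for path in product(steps, repeat=n):
        s, m = 0, 0
        for x in path:
            s += x
            m = max(m, s)
        total += m
    return total / len(steps) ** n


def expected_max_via_spitzer(n: int, steps: Sequence[int] = (-1, 1)) -> Fraction:
    """Sum over k = 1..n of E max(0, S_k) / k (corollary of Spitzer's identity)."""
    total = Fraction(0)
    for k in range(1, n + 1):
        positive_part = sum(
            Fraction(max(0, sum(path))) for path in product(steps, repeat=k)
        )
        total += positive_part / len(steps) ** k / k
    return total


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, expected_max_direct(n), expected_max_via_spitzer(n))
```

The two columns agree exactly because the arithmetic is rational; the second computation needs only the one-dimensional distributions of the $S_k$, which is the practical appeal of the identity.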

We have treated a generalization of Spitzer's formula to
symmetrically dependent random variables. The random variables
$X_1, \ldots, X_n$ are said to be symmetrically dependent, if the
joint distribution function
$F_n(x_1, \ldots, x_n) = P(X_1 \le x_1, \ldots, X_n \le x_n)$
is a symmetric function of $x_1, \ldots, x_n$. The random variables
$X_1, X_2, \ldots$ are said to be symmetrically dependent if, for each
positive $n$, the random variables $X_1, \ldots, X_n$ are
symmetrically dependent. We remark that if the random variables
$X_1, X_2, \ldots$ are symmetrically dependent, then they are
equivalent according to the terminology used by de Finetti [16].
It was shown by de Finetti that it is possible to deduce
results for equivalent random variables from results for
independent random variables. We do, however, prefer to work directly
on symmetrically dependent random variables for two reasons.

One is that the random variables $X_1, \ldots, X_n$ may be
symmetrically dependent although it is not possible to extend the
finite sequence $X_1, \ldots, X_n$ to an infinite sequence
$X_1, \ldots, X_n, X_{n+1}, \ldots$ such that the infinite sequence
of random variables is symmetrically dependent. We therefore in
some cases get more general results when we work directly on
symmetrically dependent random variables. The other is that the
combinatorial methods we use seem to be naturally adapted to the
concept of symmetrically dependent random variables.

For independent random variables multiplication of
characteristic functions corresponds to addition of random variables.
The same is not true for dependent random variables. If, however,
a set of symmetrically dependent random variables is given, then
it is possible to introduce a symbolic multiplication of
characteristic functions of random variables which are functions
of the given set of symmetrically dependent random variables.
This symbolic multiplication coincides with the ordinary
multiplication in case the given symmetrically dependent
random variables are independent and identically distributed.

Having introduced this symbolic multiplication of characteristic
functions, the definition of the right-hand side of (9) follows
by application of the exponential series. We have proved by
combinatorial methods that (9) holds for symmetrically
dependent random variables $X_1, X_2, \ldots$ when the multiplication
of characteristic functions is the symbolic multiplication
introduced above.

5. The equivalence principle in the theory of fluctuations
of sums of random variables states that the number $N_n$ of
positive sums among $S_1, \ldots, S_n$ has the same distribution as
the index $L_n$ of the first maximum, if the random variables
$X_1, \ldots, X_n$ are symmetrically dependent. We have, in [4],
obtained a generalization of this result. The random variables
$N_{n,k}$ and $L_{n,k}$ are defined by

$N_{n,k}$ = number of indices $j$ ($j = 0, \ldots, k-1, k+1, \ldots, n$)
for which $S_j \ge S_k$ if $j < k$ or $S_j > S_k$ if $j > k$,

$L_{n,k}$ = index $v$ ($v = 0, \ldots, n$) of the sum $S_v$ for which
exactly $k$ sums $S_j$ ($j = 0, \ldots, v-1, v+1, \ldots, n$) satisfy
$S_j \ge S_v$ if $j < v$ or $S_j > S_v$ if $j > v$.

We remark that $N_{n,0}$ is the number of positive sums usually
denoted by $N_n$ and that $L_{n,0}$ is the index of the first
maximum.


Theorem 4. Let $X_1, \ldots, X_n$ be symmetrically dependent
random variables and let $C$ be an event which is symmetric
with respect to $X_1, \ldots, X_n$; then

$$P(N_{n,k} = j, C) = P(L_{n,k} = j, C), \qquad k = 0, 1, \ldots, n, \quad j = 0, \ldots, n. \tag{10}$$


The proof we have given of Theorem 4 in [4] is based on
a one-to-one measure-preserving mapping of the n-dimensional
sample space onto itself such that the event $[N_{n,k} = j, C]$ is
mapped onto the event $[L_{n,k} = j, C]$. For each point in
the sample space the mapping is a permutation of the coordinates.
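Theorem 4 can be checked by brute force in a simple symmetrically dependent model: a uniformly random permutation of a fixed list of numbers, with $C$ taken to be the whole sample space. The Python sketch below is an illustration under that assumption; the function names and the sample values are hypothetical. It reads $N_{n,k}$ as the rank of $S_k$ among $S_0, \ldots, S_n$, with ties broken in favour of earlier indices, and $L_{n,k}$ as the index carrying rank $k$.

```python
from collections import Counter
from itertools import permutations
from typing import List, Sequence, Tuple


def partial_sums(xs: Sequence[int]) -> List[int]:
    """Return S_0 = 0, S_1, ..., S_n."""
    s = [0]
    for x in xs:
        s.append(s[-1] + x)
    return s


def n_stat(s: Sequence[int], k: int) -> int:
    """N_{n,k}: indices j != k with S_j >= S_k (j < k) or S_j > S_k (j > k)."""
    return sum(
        1
        for j in range(len(s))
        if (j < k and s[j] >= s[k]) or (j > k and s[j] > s[k])
    )


def l_stat(s: Sequence[int], k: int) -> int:
    """L_{n,k}: the unique index v whose count n_stat(s, v) equals k."""
    return next(v for v in range(len(s)) if n_stat(s, v) == k)


def distributions(values: Sequence[int]) -> Tuple[List[Counter], List[Counter]]:
    """Counts of N_{n,k} and L_{n,k}, k = 0..n, over all orderings of values."""
    n = len(values)
    n_counts = [Counter() for _ in range(n + 1)]
    l_counts = [Counter() for _ in range(n + 1)]
    for perm in permutations(values):
        s = partial_sums(perm)
        for k in range(n + 1):
            n_counts[k][n_stat(s, k)] += 1
            l_counts[k][l_stat(s, k)] += 1
    return n_counts, l_counts


if __name__ == "__main__":
    n_counts, l_counts = distributions([2, -1, -1, 3, 0])
    for k, (a, b) in enumerate(zip(n_counts, l_counts)):
        print(k, sorted(a.items()), sorted(b.items()))
```

The sample values deliberately include repeated and zero increments, so that ties among the partial sums occur and the tie-breaking rule in the definitions is exercised.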


6. In connection with the work on fluctuations of sums of
random variables we have investigated the Toeplitz matrices of
Laurent polynomials. The results obtained are given in TN4. It
was conjectured that this study would be valuable for the work
on extension of the Arc-sine Law to the case where the random
variables $X_1, X_2, \ldots$ form a stationary Markov chain. We
have, however, not been able to obtain results in this direction.


7. For a sequence of independent, identically distributed
random variables the growth of the maximum of the first $n$
variables has been investigated and a "Law of the iterated
logarithm" has been proved in TN5.

Theorem 5. Let $X_1, X_2, \ldots$ be independent, identically
distributed random variables, with distribution function $F(x)$.
Let $\lambda_1, \lambda_2, \ldots$ be a nondecreasing sequence of
real numbers; then

$$P\Bigl(\limsup_{n\to\infty} \{\max(X_1, \ldots, X_n) \le \lambda_n\}\Bigr) = 0$$

if

$$\sum_{n=3}^{\infty} (F(\lambda_n))^n\, \frac{\log\log n}{n} < \infty;$$

furthermore

$$P\Bigl(\limsup_{n\to\infty} \{\max(X_1, \ldots, X_n) \le \lambda_n\}\Bigr) = 1$$

if the sequence $(F(\lambda_n))^n$, $n = 1, 2, \ldots$, is
non-increasing and

$$\sum_{n=3}^{\infty} (F(\lambda_n))^n\, \frac{\log\log n}{n} = \infty.$$

The proof of Theorem 5 is based on a generalization,
obtained in TN5, of the convergence part of the Borel-Cantelli
lemmas.
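The series in Theorem 5 can be explored numerically for a concrete case. The Python sketch below is only an illustration, and everything specific in it is an assumption not taken from the report: the exponential distribution $F(x) = 1 - e^{-x}$, the sequence $\lambda_n = \log\bigl(n / (c \log\log n)\bigr)$ with a hypothetical constant $c$, and the function names. With this choice $(F(\lambda_n))^n = (1 - c \log\log n / n)^n$, which behaves like $(\log n)^{-c}$ for large $n$, so the integral test suggests convergence for $c > 1$ and divergence for $c \le 1$. Finite partial sums can only hint at this behaviour; they do not prove it.

```python
from math import exp, log


def term(n: int, c: float) -> float:
    """n-th term (F(lambda_n))^n * log log n / n for the assumed F and lambda_n."""
    loglog = log(log(n))           # positive for n >= 3
    lam = log(n / (c * loglog))    # assumed sequence lambda_n
    f = 1.0 - exp(-lam)            # F(lambda_n) = 1 - c * loglog / n
    return f**n * loglog / n


def partial_sum(upper: int, c: float) -> float:
    """Sum of the terms for n = 3, ..., upper."""
    return sum(term(n, c) for n in range(3, upper + 1))


if __name__ == "__main__":
    for c in (0.5, 2.0):
        print(c, [partial_sum(10**p, c) for p in (2, 3, 4, 5)])
```

Comparing the growth of the partial sums for the two values of $c$ gives a feeling for how slowly the series decides between the two alternatives of the theorem.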

